Cost-Sensitive Decision Trees with Pre-pruning

نویسندگان

  • Jun Du
  • Zhihua Cai
  • Charles X. Ling
چکیده

This paper explores two simple and efficient pre-pruning strategies for the cost-sensitive decision tree algorithm to avoid overfitting. One is to limit the cost-sensitive decision trees to a depth of two. The other is to prune the trees with a pre-specified threshold. Empirical study shows that, compared to the error-based tree algorithm C4.5 and several other cost-sensitive tree algorithms, the new cost-sensitive decision trees with pre-pruning are more efficient and perform well on most UCI data sets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

CC4.5: cost-sensitive decision tree pruning

There are many methods to prune decision trees, but the idea of cost-sensitive pruning has received much less investigation even though additional flexibility and increased performance can be obtained from this method. In this paper, we introduce a cost-sensitive decision tree pruning algorithm called CC4.5 based on the C4.5 algorithm. This algorithm uses the same method as C4.5 to construct th...

متن کامل

Decision Tree Pruning Using Expert Knowledge

Decision tree technology has proven to be a valuable way of capturing human decision making within a computer. It has long been a popular artificial intelligence(AI) technique. During the 1980s, it was one of the primary ways for creating an AI system. During the early part of the 1990s, it somewhat fell out of favor, as did the entire AI field in general. However, during the later 1990s, with ...

متن کامل

Cost-sensitive Decision Trees with Post-pruning and Competition for Numeric Data

Decision tree is an effective classification approach in data mining and machine learning. In some applications, test costs and misclassification costs should be considered while inducing decision trees. Recently, some cost-sensitive learning algorithms based on ID3, such as CS-ID3, IDX, ICET and λ-ID3, have been proposed to deal with the issue. In this paper, we develop a decision tree algorit...

متن کامل

Cost-sensitive C4.5 with post-pruning and competition

Decision tree is an effective classification approach in data mining and machine learning. In applications, test costs and misclassification costs should be considered while inducing decision trees. Recently, some cost-sensitive learning algorithms based on ID3 such as CS-ID3, IDX, λ-ID3 have been proposed to deal with the issue. These algorithms deal with only symbolic data. In this paper, we ...

متن کامل

Induction of Modular Classification Rules: Using Jmax-pruning

The Prism family of algorithms induces modular classification rules which, in contrast to decision tree induction algorithms, do not necessarily fit together into a decision tree structure. Classifiers induced by Prism algorithms achieve a comparable accuracy compared with decision trees and in some cases even outperform decision trees. Both kinds of algorithms tend to overfit on large and nois...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007